1. Introduction. For a particular absolutely continuous distribution
$F$, assumed to depend on a location or scale parameter $\theta$, the insightful
work of Adatia and Chan (1981) has focused on the equivalence of the problems
of i) optimal grouping for maximum likelihood estimation of $\theta$, ii) optimal
quantile (optimal spacing) selection for the asymptotically best linear
unbiased estimator (ABLUE) of $\theta$ and iii) optimal stratification for
estimation of a population mean. In this paper the more general question of
when different distributions have equivalent solutions for these problems
is considered from a quantile domain perspective. We focus, initially, on
the latter two problems, which can be stated as follows:

Problem 1. Select percentile points $0 = u_0 < u_1 < \cdots < u_{k+1} = 1$ (frequently
called a spacing) corresponding to sample quantiles which maximize the
asymptotic relative Fisher efficiency of the ABLUE for $\theta$ (cf. Sarhan and
Greenberg (1962, Chap. 5)).

Problem 2. Given strata boundaries $a = x_0 < x_1 < \cdots < x_{k+1} = b$ (where $a$
and $b$ are possibly infinite values which bound the support of $F$), a
stratified random sample of size $n$ is to be selected using proportional
allocation, i.e., the "number" of sample elements taken from $(x_{i-1}, x_i]$ is
$n[F(x_i) - F(x_{i-1})]$. If $\theta$ is the mean for $F$, the usual estimator of $\theta$ is
$\hat{\theta} = \sum_{i=1}^{k+1} [F(x_i) - F(x_{i-1})]\,\bar{X}_i$, where $\bar{X}_i$ is the sample mean from the $i$th stratum.
The problem is to select the boundaries to minimize the variance of $\hat{\theta}$
(cf. Dalenius (1950)). Observe that when $F$ is a normal distribution
$\theta$ is a location parameter, whereas for the exponential distribution
$\theta$ corresponds to a scale parameter.

Both Problems 1 and 2 are nonlinear in nature so that, as a rule, their
solutions must be tabulated numerically. However, in some instances it has
been possible to exploit relationships between different types of
distributions to obtain, for example, optimal spacings for one distribution
in terms of those for another which have already been computed
(cf. Kulldorff (1973)). When applicable, this approach can save
considerable time, effort and expense. The question that arises, of course,
is when and under what conditions such tactics can be expected to work or
fail. Thus, we would like easily checked conditions regarding the
equivalence (or non-equivalence) of optimal spacing or stratification problems
for different distribution types. Motivated by such considerations we
will study the relationship between two distributions having the same
solutions for either of Problems 1 or 2. Our major results in this regard
(Theorems 1-3) are stated and discussed in the next section. It will be
seen that necessary and sufficient conditions for the solutions of any of
these problems to coincide for two distributions can be succinctly
summarized in terms of relationships between their quantile and
density-quantile functions. Proofs are given in Section 3, with Section 4
devoted to the application of the results in Section 2 to the optimal
grouping problem.

2. Optimal Spacing and Stratification. Let $F_1(x;\theta_1)$ and $F_2(x;\theta_2)$
be two strictly monotone, continuously differentiable distribution functions
(d.f.'s) which depend on parameters $\theta_1$ and $\theta_2$ of either the location or
scale variety. The standardized forms of these d.f.'s, corresponding
to $\theta_i = 0$ or $1$ ($i = 1, 2$) contingent on whether $\theta_i$ is a location or scale
parameter, will be denoted as $H_1$ and $H_2$, respectively, with their associated
continuous densities written as $h_1$ and $h_2$. Thus, for example, $F_1(x;\theta_1)$ can be
expressed as $H_1(x - \theta_1)$, if $\theta_1$ is a location parameter, or $H_1(x/\theta_1)$, if $\theta_1$
is a scale parameter. Also, define the standardized quantile functions
(q.f.'s) and density-quantile functions (d.q.f.'s)

$$Q_i(u) = H_i^{-1}(u) = \inf\{x : H_i(x) \ge u\}, \quad 0 < u < 1, \quad i = 1, 2, \tag{1}$$

and

$$d_i(u) = h_i(Q_i(u)), \quad 0 < u < 1, \quad i = 1, 2. \tag{2}$$

Our principal result regarding two distributions having the same optimal
spacings is provided by the following theorem.

Theorem 1. Let $g_i$, $i = 1, 2$, denote either $d_i$ or the product of $d_i$ and $Q_i$,
$d_i \cdot Q_i$, depending on whether $\theta_i$ is a location or scale parameter. Assume
that $g_i$ is either concave or convex, vanishes at 0 and 1 and is twice
continuously differentiable on $(0,1)$ with $(g_i')^2$ and $|g_i''|^{2/3}$ integrable.
Under these assumptions $F_1$ and $F_2$ will have the same optimal spacings for
the estimation of $\theta_1$ and $\theta_2$ for all $k$ if and only if there exist constants
$\alpha, \beta$ ($\beta \neq 0$) such that

$$g_1'(u) = \alpha + \beta\, g_2'(u), \quad u \in (0,1). \tag{3}$$

To exemplify the use of Theorem 1 consider the Weibull distribution

$$F_1(x;\theta_1) = 1 - \exp\{-(x/\theta_1)^{\nu}\}, \quad x, \nu > 0,$$

for which $H_1(x) = 1 - \exp\{-x^{\nu}\}$, $Q_1(u) = \{-\ln(1-u)\}^{1/\nu}$ and
$d_1(u) = \nu (1-u)[-\ln(1-u)]^{1 - 1/\nu}$. Since $\theta_1$ is a scale parameter we use
$g_1(u) = d_1(u) Q_1(u) = -\nu (1-u)\ln(1-u)$. A special case of the Weibull is
the exponential distribution, which corresponds to $\nu = 1$. The optimal spacings
for the exponential have been tabulated and may be found, for instance, in
Sarhan, Greenberg and Ogawa (1963). Taking $g_2(u) = -(1-u)\ln(1-u)$ it follows
from (3) that these spacings are also optimal for the Weibull when $\nu \neq 1$.
This relationship has also been noted by Kulldorff (1973). Other results
obtained by Kulldorff (1973) follow similarly from Theorem 1.
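The Weibull calculation can be checked numerically. The following is a minimal sketch, assuming only the standardized density, quantile function and scale-parameter definition $g = d \cdot Q$ stated in the text; the function names are illustrative and do not come from the paper. It builds $g_1$ directly from $h_1$ and $Q_1$ and compares it with the exponential $g_2(u) = -(1-u)\ln(1-u)$.

```python
import math


def weibull_q(u, nu):
    """Standardized Weibull quantile function Q(u) = {-ln(1-u)}^(1/nu)."""
    return (-math.log(1.0 - u)) ** (1.0 / nu)


def weibull_h(x, nu):
    """Standardized Weibull density h(x) = nu * x^(nu-1) * exp(-x^nu)."""
    return nu * x ** (nu - 1.0) * math.exp(-(x ** nu))


def weibull_g(u, nu):
    """Scale-parameter case: g(u) = d(u) * Q(u), where d(u) = h(Q(u))."""
    q = weibull_q(u, nu)
    return weibull_h(q, nu) * q


def exponential_g(u):
    """g for the exponential distribution (Weibull with nu = 1)."""
    return -(1.0 - u) * math.log(1.0 - u)


# Ratio g_1(u) / g_2(u) on an interior grid, for a few illustrative shapes.
grid = [i / 20 for i in range(1, 20)]
ratios = {
    nu: [weibull_g(u, nu) / exponential_g(u) for u in grid]
    for nu in (0.5, 2.0, 3.5)
}
```

If the ratio is constant and equal to $\nu$ over the grid, then $g_1 = \nu g_2$ and hence $g_1' = \nu g_2'$, which is condition (3) with $\alpha = 0$ and $\beta = \nu$.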
As another example consider the logistic distribution with

$$F_1(x;\theta_1) = [1 + \exp\{-\pi (x - \theta_1)/\sqrt{3}\}]^{-1}, \quad -\infty < x < \infty,$$

so that $H_1(x) = [1 + \exp\{-\pi x/\sqrt{3}\}]^{-1}$, $Q_1(u) = (\sqrt{3}/\pi)\ln(u/(1-u))$ and
$g_1(u) = d_1(u) = (\pi/\sqrt{3})\, u(1-u)$. By choosing $g_2(u) = \phi(\Phi^{-1}(u))$, where $\Phi$ and $\phi$
are the standard normal distribution and density function, respectively,
it is seen that the optimal spacings for location parameter estimation
for the logistic and normal distributions cannot be identical for all $k$.
If instead we choose $F_2(x;\theta_2) = 1 - (1 + x/\theta_2)^{-\nu}$, $x, \nu > 0$, it follows that
the optimal spacings for location parameter estimation for the logistic
are the same as those for scale parameter estimation in the Pareto
distribution if and only if $\nu = 1$.
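Both comparisons with the logistic can be illustrated numerically. Because the logistic $g_1'$ is affine in $u$, condition (3) can hold only if $g_2'$ is also affine in $u$, that is, only if $g_2$ is a quadratic in $u$. The sketch below is a rough check under that reading: it uses the third finite difference of $g_2$ on a fixed grid, which vanishes (up to rounding) for a quadratic. The grid, the helper names and the choice $\nu = 2$ for the non-matching Pareto case are assumptions made for illustration.

```python
import math
from statistics import NormalDist

STEP = 0.05  # grid spacing; all evaluation points stay inside (0, 1)


def logistic_g(u):
    """Location case for the logistic: g(u) = d(u) = (pi/sqrt(3)) u (1-u)."""
    return math.pi / math.sqrt(3.0) * u * (1.0 - u)


def normal_g(u):
    """Location case for the normal: g(u) = phi(Phi^{-1}(u))."""
    std = NormalDist()
    return std.pdf(std.inv_cdf(u))


def pareto_g(u, nu):
    """Scale case for the Pareto d.f. 1 - (1 + x)^(-nu): g(u) = d(u) Q(u)."""
    q = (1.0 - u) ** (-1.0 / nu) - 1.0        # quantile function Q(u)
    d = nu * (1.0 - u) ** (1.0 + 1.0 / nu)    # density-quantile d(u)
    return d * q


def max_third_difference(g):
    """Largest |third finite difference| of g for u = 0.10, ..., 0.75."""
    worst = 0.0
    for i in range(2, 16):
        u = i * STEP
        diff = (g(u + 3 * STEP) - 3 * g(u + 2 * STEP)
                + 3 * g(u + STEP) - g(u))
        worst = max(worst, abs(diff))
    return worst


curvature = {
    "logistic": max_third_difference(logistic_g),
    "normal": max_third_difference(normal_g),
    "pareto nu=1": max_third_difference(lambda u: pareto_g(u, 1.0)),
    "pareto nu=2": max_third_difference(lambda u: pareto_g(u, 2.0)),
}
```

Under these assumptions the logistic and Pareto ($\nu = 1$) entries are zero to rounding error, while the normal and Pareto ($\nu = 2$) entries are not, in line with the statements above. This is a numerical illustration on a grid, not a proof.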

Two distributions will be said to have the same optimal solutions
for Problem 2 if, for any set $\{x_{i1}\}_{i=0}^{k+1}$ of optimal strata boundaries for $F_1$,
there is a corresponding set $\{x_{i2}\}_{i=0}^{k+1}$ of optimal boundaries for $F_2$ which
satisfies

$$F_1(x_{i1};\theta_1) = F_2(x_{i2};\theta_2), \quad i = 0, \ldots, k+1. \tag{4}$$

Such a definition is natural since it considers equivalence in terms of
percentage points that are not influenced by departures in the values
of $x_{i1}$ and $x_{i2}$ due merely to factors of location or scale. The next
theorem has the consequence that a distribution is essentially determined
by its optimal strata boundaries.

Theorem 2. If, for $i = 1, 2$, $Q_i$ is square integrable and $Q_i' = 1/d_i$ is monotone
and continuous on $(0,1)$ with $d_i^{-2/3}$ integrable, then $F_1$ and $F_2$ have the same
solution for Problem 2 (in the sense of (4)) if and only if there
exist constants $\alpha, \beta$ ($\beta \neq 0$) such that

$$Q_1(u) = \alpha + \beta\, Q_2(u), \quad u \in (0,1). \tag{5}$$

Theorem 2 has the implication that distributions with the same optimal
strata boundaries must be members of the same family which differ
by at most factors of location and/or scale. While the direct
implication of this condition appears obvious, the converse, although
intuitive, is somewhat less transparent.

It is also reasonable to ask under what conditions the optimal
spacings for one distribution can be obtained in terms of the optimal
strata boundaries for another in the sense that, if $\{x_{i2}\}_{i=0}^{k+1}$ is a set
of optimal boundaries for $F_2$, an optimal spacing for $F_1$ is provided by

$$u_{i1} = F_2(x_{i2};\theta_2), \quad i = 0, \ldots, k+1. \tag{6}$$

Such conditions are provided by the following theorem.

Theorem 3. Let $g_1$ denote either $d_1$ or $d_1 \cdot Q_1$, depending on whether $\theta_1$
is a location or scale parameter, and assume that $g_1$ and $Q_2$ satisfy
the hypotheses of Theorems 1 and 2, respectively. Then Problem 1 for $F_1$
is equivalent to Problem 2 for $F_2$ (in the sense of (6)) for all $k$
if and only if there exist constants $\alpha, \beta$ ($\beta \neq 0$) such that

$$g_1'(u) = \alpha + \beta\, Q_2(u), \quad u \in (0,1). \tag{7}$$

As an illustration, note that it follows from (7) that optimal spacing
selection for location parameter estimation for the logistic is equivalent
to the problem of optimal stratification for the estimation of the mean of
a uniform distribution on any finite interval $[a,b]$.
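The logistic and uniform illustration can be written out explicitly. The following is a short worked check, assuming the logistic $g_1(u) = (\pi/\sqrt{3})\,u(1-u)$ from the earlier example and taking $Q_2(u) = a + (b-a)u$, the quantile function of the uniform distribution on $[a,b]$ (used here without standardization, since any affine change of $Q_2$ is absorbed into the constants).

```latex
g_1(u) = \frac{\pi}{\sqrt{3}}\,u(1-u)
\;\Longrightarrow\;
g_1'(u) = \frac{\pi}{\sqrt{3}}\,(1 - 2u),
\qquad
Q_2(u) = a + (b-a)\,u
\;\Longrightarrow\;
u = \frac{Q_2(u) - a}{b-a},
\\[4pt]
g_1'(u) = \underbrace{\frac{\pi}{\sqrt{3}}\,\frac{a+b}{b-a}}_{\alpha}
\;+\; \underbrace{\left(-\frac{2\pi}{\sqrt{3}\,(b-a)}\right)}_{\beta}\,Q_2(u),
\qquad \beta \neq 0 .
```

So (7) holds for every finite interval with $a < b$, with constants that depend on $a$ and $b$ only through $\alpha$ and $\beta$.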
3. Proofs. In this section Theorems 1-3 will be proven. The proofs
are accomplished by a series of three lemmas.

Lemma 1. Let $d$ and $Q$ denote the standardized d.q.f. and q.f. for a
distribution which depends on either a location or scale parameter, $\theta$. Define $g$
as $d$, for $\theta$ a location parameter, or $d \cdot Q$, for $\theta$ a scale parameter, and
assume that $g$ vanishes at 0 and 1 and is absolutely continuous with square
integrable derivative $g'$. Then, the problem of optimal spacing selection
for the ABLUE of $\theta$ is equivalent to the selection of a best set of
breakpoints for $L_2[0,1]$ approximation of $g'$ by piecewise constants.

Proof. See Eubank, Smith and Smith (1981).

Lemma 2. Let $Q$ denote the standardized q.f. for an absolutely continuous
d.f. $F(x;\theta)$ where $\theta$ is either a location or scale parameter. Assume
that $F$ has a finite second moment and mean proportional to $\theta$ and that $Q$ is
continuous and strictly monotone on $(0,1)$. Then, Problem 2 for $F$ is
equivalent to selecting optimal breakpoints for $L_2[0,1]$ approximation of $Q$ by
piecewise constants.

Proof. First observe that, since $F$ has a finite second moment, $Q \in L_2[0,1]$.
For the case of $\theta$ a location parameter, $F$ may be expressed as $H(x - \theta)$, where
$H$ is the distribution function corresponding to $Q$. Now, for any set of
strata boundaries $a = x_0 < x_1 < \cdots < x_{k+1} = b$ it has been shown by Dalenius (1950)
that the variance of $\hat{\theta}$ is

$$V(\hat{\theta}) = n^{-1} \sum_{i=1}^{k+1} [H(x_i - \theta) - H(x_{i-1} - \theta)]\,\sigma_i^2, \tag{8}$$

where

$$[H(x_i - \theta) - H(x_{i-1} - \theta)]\,\sigma_i^2 = \int_{x_{i-1}}^{x_i} x^2\,dH(x - \theta) - [H(x_i - \theta) - H(x_{i-1} - \theta)]^{-1} \left[ \int_{x_{i-1}}^{x_i} x\,dH(x - \theta) \right]^2. \tag{9}$$

Making the change of variable $x - \theta = Q(u)$ in (9), letting $u_i = H(x_i - \theta)$
and simplifying gives

$$nV(\hat{\theta}) = \int_0^1 \left[ Q(u) - \sum_{i=1}^{k+1} (u_i - u_{i-1})^{-1} \int_{u_{i-1}}^{u_i} Q(s)\,ds \; I_{(u_{i-1},u_i]}(u) \right]^2 du,$$

where $I_{(u_{i-1},u_i]}$ is the indicator function for $(u_{i-1},u_i]$. Thus, $nV(\hat{\theta})$
is now recognized as the squared $L_2[0,1]$ error for the approximation of
$Q$ by piecewise constants with breakpoints at $0 = u_0 < u_1 < \cdots < u_{k+1} = 1$.
Observing that the $u_i$ and $x_i$ are uniquely defined by $\theta + Q(u_i) = x_i$, it follows
that minimization of (8) with respect to the $u_i$'s or the $x_i$'s are equivalent
problems. The case of scale parameter estimation is proven similarly.
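The quantile-domain form of the variance lends itself to direct computation. The sketch below is illustrative only: the function name, the midpoint-rule integration and the uniform test case are assumptions, not part of the paper. It evaluates the squared $L_2[0,1]$ error of the best piecewise-constant fit to a quantile function $Q$ for given breakpoints, and compares it with the familiar stratified-sampling expression (stratum weight times stratum variance) for the uniform distribution on $[0,1]$, where $Q(u) = u$.

```python
def piecewise_constant_l2_error(q, breakpoints, cells=2000):
    """Squared L2[0,1] error of the best piecewise-constant fit to q.

    On each interval (u_{i-1}, u_i] the best constant is the average of q,
    so the error is the sum over intervals of the integral of
    (q - average)^2. Integrals use the midpoint rule with `cells` cells
    per interval.
    """
    total = 0.0
    for lo, hi in zip(breakpoints[:-1], breakpoints[1:]):
        h = (hi - lo) / cells
        values = [q(lo + (j + 0.5) * h) for j in range(cells)]
        average = sum(values) / cells
        total += sum((v - average) ** 2 for v in values) * h
    return total


def uniform_q(u):
    """Quantile function of the uniform distribution on [0, 1]."""
    return u


equal = [0.0, 0.25, 0.5, 0.75, 1.0]
uneven = [0.0, 0.1, 0.5, 0.9, 1.0]

error_equal = piecewise_constant_l2_error(uniform_q, equal)
error_uneven = piecewise_constant_l2_error(uniform_q, uneven)

# Direct value of n * V(theta-hat) for the uniform: each stratum has
# weight (width) and variance width^2 / 12, giving sum(width^3) / 12.
direct_equal = sum((b - a) ** 3 for a, b in zip(equal[:-1], equal[1:])) / 12.0
```

For the uniform case the two computations agree to within the integration error, and the equally spaced breakpoints give a smaller error than the uneven set, which is what the piecewise-constant picture suggests.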


As  a  result  of  Lemmas  1  and  2  questions  concerning  the  equivalence 
of  Problems  1  and  2  can  now  be  viewed  as  questions  regarding  the  equivalence 
of  breakpoint  selection  problems  for  L2[0,l]  approximation  by  piecewise 
constants.  This  subject  is  treated  by  the  next  lemma. 


Lemma 3. Let $m_1$ and $m_2$ be square integrable functions and assume that,
for $i = 1, 2$, $m_i'$ is continuous, monotone and of one sign on $(0,1)$ with
$|m_i'|^{2/3}$ integrable. Then $m_1$ and $m_2$ will have the same optimal breakpoints
for $L_2[0,1]$ approximation by piecewise constants for all $k$ if and only
if there exist constants $\alpha, \beta$ ($\beta \neq 0$) such that

$$m_1(u) = \alpha + \beta\, m_2(u), \quad u \in (0,1). \tag{10}$$


Proof. The sufficiency of (10) follows immediately upon noting that,
in this event, the $L_2[0,1]$ errors for approximation of $m_1$ and $m_2$ are
proportional with proportionality factor $|\beta|$. To establish its necessity
let $\{U_k\}$ be a sequence of sets of optimal breakpoints for $m_1$, where
$U_k = \{u_0^k, \ldots, u_{k+1}^k\}$ with $u_0^k = 0$ and $u_{k+1}^k = 1$. By hypothesis each $U_k$ is
also optimal for $m_2$. Now, as in Barrow and Smith (1978), define piecewise
linear functions $s_k$ with $s_k(u_i^k) = i/(k+1)$, $i = 0, \ldots, k+1$. Then, using
Theorem 1.1 of Burchard and Hale (1975) in conjunction with the proof
of Theorem 3 in Barrow and Smith (1978), it is seen that

$$\lim_{k \to \infty} s_k(\tau) = \int_0^{\tau} |m_1'(u)|^{2/3}\,du \Big/ \int_0^1 |m_1'(t)|^{2/3}\,dt. \tag{11}$$

However, as the $U_k$ are also optimal for $m_2$, it must be that

$$\lim_{k \to \infty} s_k(\tau) = \int_0^{\tau} |m_2'(u)|^{2/3}\,du \Big/ \int_0^1 |m_2'(t)|^{2/3}\,dt. \tag{12}$$

The lemma now follows by equating (11) and (12) and differentiating.
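The limit in (11) also has a practical reading. Since $s_k(u_i^k) = i/(k+1)$, for large $k$ the optimal breakpoints are approximately the points at which the normalized integral of $|m'|^{2/3}$ reaches $i/(k+1)$. The sketch below computes these approximate breakpoints by numerical inversion. It is a large-$k$ heuristic under the hypotheses of Lemma 3, not an exact optimizer for finite $k$, and the function name, grid size and test functions are assumptions introduced for illustration.

```python
def asymptotic_breakpoints(m_prime, k, n_grid=20000):
    """Approximate breakpoints u_i with S(u_i) = i / (k + 1), where
    S(tau) = int_0^tau |m'|^(2/3) du / int_0^1 |m'|^(2/3) du.
    """
    h = 1.0 / n_grid
    # Midpoint-rule cumulative integral of |m'(u)|^(2/3) over (0, 1).
    cumulative = [0.0]
    for j in range(n_grid):
        mid = (j + 0.5) * h
        cumulative.append(cumulative[-1] + abs(m_prime(mid)) ** (2.0 / 3.0) * h)
    total = cumulative[-1]

    points = [0.0]
    j = 0
    for i in range(1, k + 1):
        target = total * i / (k + 1)
        # Advance to the grid cell whose cumulative mass brackets the target.
        while cumulative[j + 1] < target:
            j += 1
        # Linear interpolation inside cell j.
        frac = (target - cumulative[j]) / (cumulative[j + 1] - cumulative[j])
        points.append((j + frac) * h)
    points.append(1.0)
    return points


# m(u) = u (the uniform quantile function): m' is constant, so the
# approximate breakpoints are equally spaced.
uniform_points = asymptotic_breakpoints(lambda u: 1.0, k=3)

# A hypothetical m with m'(u) = u^3: |m'|^(2/3) = u^2, so S(tau) = tau^3
# and the single interior breakpoint for k = 1 is 0.5 ** (1/3).
cubic_points = asymptotic_breakpoints(lambda u: u ** 3, k=1)
```

Two functions related as in (10) have proportional derivatives, so the normalized integral, and hence the output of this routine, is the same for both.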

To prove Theorem 1 take $m_1 = g_1'$ and $m_2 = g_2'$ in Lemma 3 and observe that
the convexity or concavity of $g_i$ is equivalent to $g_i''$ being of one sign
on $(0,1)$. Theorems 2 and 3 can be obtained similarly by taking $m_1 = Q_1$,
$m_2 = Q_2$ and $m_1 = g_1'$, $m_2 = Q_2$, respectively.

Remark. Lemmas 1 and 2 have the consequence that problems of optimal
spacing and stratification, when viewed in the quantile domain, have
simple geometric interpretations as piecewise constant approximation
problems. Relationships between the solutions to these problems for
different distributions can, therefore, often be detected by merely
graphing the appropriate functions.

4. Extension to Optimal Grouping. The results of Section 2 can be
extended to include the following optimal grouping problem considered by
Kulldorff (1961).

Problem 3. Choose group boundaries, $a = x_0 < x_1 < \cdots < x_{k+1} = b$, that minimize
the asymptotic variance of the maximum likelihood estimator of a location
or scale parameter, $\theta$, obtained using only the group boundaries and the
proportion of sample elements within each group.

Theorems 1 and 3 can also be shown to apply to Problem 3 through the
use of Theorem 4 of Adatia and Chan (1981). Their result states that
for a given distribution Problems 1 and 3 are always equivalent (in the
sense of (6)), provided the distribution satisfies certain regularity
conditions specified by Kulldorff (1961, pg. 20). Using this fact and
restricting attention to the special case of $F_1 = F_2 = F$, $\theta_1 = \theta_2 = \theta$,
$d_1 = d_2 = d$ and $Q_1 = Q_2 = Q$, a quantile domain version of Theorem 5 of Adatia
and Chan (1981) follows from Theorem 3.

Corollary. Let $g$ represent either $d$ or $d \cdot Q$, depending on whether $\theta$ is
a location or scale parameter, and assume that $g$ and $Q$ satisfy the
hypotheses of Theorems 1 and 2, respectively. Then Problems 1-3 are equivalent
for all $k$ if and only if there exist constants $\alpha, \beta$ ($\beta \neq 0$) such that

$$g'(u) = \alpha + \beta\, Q(u), \quad u \in (0,1). \tag{13}$$

In comparing the Corollary with the results of Adatia and Chan
observe that our approach dispenses with conditions requiring the existence
of a sequence of strata boundaries, $\{B_k\}$, for which the corresponding
estimators satisfy $\lim_{k \to \infty} V(\hat{\theta}_k) = 0$. We note that it follows immediately
from (13) that Problems 1-3 are equivalent for either a normal or gamma
distribution.
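The normal and gamma cases can be checked directly against (13). The following worked check is a sketch under stated assumptions: the gamma density is taken in standardized scale form with a shape parameter written here as $\kappa$ (a symbol not used in the text), the identity $Q'(u) = 1/d(u)$ is used throughout, and the regularity hypotheses of the Corollary are taken for granted rather than verified.

```latex
% Normal, location parameter: g = d, using \phi'(x) = -x\,\phi(x).
g'(u) = d'(u) = \phi'(Q(u))\,Q'(u)
      = \frac{-Q(u)\,\phi(Q(u))}{d(u)} = -Q(u)
\quad\Longrightarrow\quad \alpha = 0,\ \beta = -1 .
\\[6pt]
% Gamma, scale parameter: g = d \cdot Q, with h(x) = x^{\kappa-1}e^{-x}/\Gamma(\kappa).
\frac{h'(x)}{h(x)} = \frac{\kappa - 1}{x} - 1
\quad\Longrightarrow\quad
d'(u) = \frac{h'(Q(u))}{h(Q(u))} = \frac{\kappa - 1}{Q(u)} - 1,
\\[4pt]
g'(u) = d'(u)\,Q(u) + d(u)\,Q'(u)
      = (\kappa - 1) - Q(u) + 1 = \kappa - Q(u)
\quad\Longrightarrow\quad \alpha = \kappa,\ \beta = -1 .
```

In both cases $g'$ is an affine function of $Q$ with $\beta \neq 0$, which is the form required by (13).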